Boosting Adversarial Transferability by Block Shuffle and Rotation
Adversarial examples mislead deep neural networks with imperceptible
perturbations and have brought significant threats to deep learning. An
important aspect is their transferability, which refers to their ability to
deceive other models, thus enabling attacks in the black-box setting. Though
various methods have been proposed to boost transferability, the performance
still falls short compared with white-box attacks. In this work, we observe
that existing input transformation based attacks, one of the mainstream
transfer-based attacks, result in different attention heatmaps on various
models, which might limit the transferability. We also find that breaking the
intrinsic relation of the image can disrupt the attention heatmap of the
original image. Based on this finding, we propose a novel input transformation
based attack called block shuffle and rotation (BSR). Specifically, BSR splits
the input image into several blocks, then randomly shuffles and rotates these
blocks to construct a set of new images for gradient calculation. Empirical
evaluations on the ImageNet dataset demonstrate that BSR could achieve
significantly better transferability than the existing input transformation
based methods under single-model and ensemble-model settings. Combining BSR
with current input transformation methods can further improve transferability,
significantly outperforming the state-of-the-art methods.
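The block shuffle and rotation transform at the core of BSR can be sketched as follows. This is a minimal NumPy sketch, assuming a square image whose side is divisible by the grid size; function and parameter names are illustrative, and the full attack additionally generates many such transformed copies and averages their gradients.

```python
import numpy as np

def block_shuffle_rotate(image, n_blocks=2, rng=None):
    """Split a square image into an n_blocks x n_blocks grid, randomly
    shuffle the blocks, and rotate each by a random multiple of 90 degrees."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    bh, bw = h // n_blocks, w // n_blocks
    blocks = [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
              for r in range(n_blocks) for c in range(n_blocks)]
    # Shuffling breaks the intrinsic spatial relations of the image,
    # which disrupts the attention heatmap of the original image.
    order = rng.permutation(len(blocks))
    blocks = [np.rot90(blocks[i], k=int(rng.integers(4))) for i in order]
    rows = [np.concatenate(blocks[r * n_blocks:(r + 1) * n_blocks], axis=1)
            for r in range(n_blocks)]
    return np.concatenate(rows, axis=0)
```

Because the transform only rearranges pixels, the output has the same shape and pixel content as the input, which keeps the perturbation budget of the attack intact.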
Optimal Variable Speed Limit Control Strategy on Freeway Segments under Fog Conditions
Fog is a critical external factor that threatens traffic safety on freeways.
Variable speed limit (VSL) control can effectively harmonize vehicle speed and
improve safety. However, most existing weather-related VSL controllers have a
limited ability to adapt to dynamic traffic environments. This study developed
an optimal VSL control strategy under fog conditions with full consideration of
the factors that affect traffic safety risk. The crash risk under fog conditions
was estimated using a crash risk prediction model based on Bayesian logistic
regression. The traffic flow with VSL control was simulated by a modified cell
transmission model (MCTM). The optimal factors of VSL control were obtained by
solving an optimization problem that coordinated safety and mobility with the
help of a genetic algorithm. A case study of I-405 in California, USA, was
designed to simulate and evaluate the effects of the proposed VSL control
strategy. The optimal VSL control factors under fog conditions were compared
with sunny conditions, and different placements of VSL signs were evaluated.
Results showed that the optimal VSL control strategy under fog conditions
changed the speed limit more cautiously. The VSL control under fog conditions
in this study effectively reduced crash risks without significantly increasing
travel time, achieving up to a 37.15% reduction in crash risk with only a 0.48%
increase in total travel time. The proposed VSL control strategy is expected to
be of great use in the development of VSL systems to enhance freeway safety
under fog conditions.
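The optimization step described above (a crash-risk model and a traffic simulation embedded in a genetic-algorithm search) can be sketched as follows. This is a toy sketch: `fitness` uses made-up surrogates in place of the paper's Bayesian logistic-regression crash-risk model and MCTM travel-time simulation, and a single speed-limit variable stands in for the full set of VSL control factors.

```python
import random

def fitness(v, w_risk=0.7, w_time=0.3):
    # Hypothetical surrogates: crash risk falls and travel time rises
    # as the speed limit v (mph) decreases below free-flow speed.
    risk = (v / 65.0) ** 2          # stand-in for the crash-risk model
    travel_time = 65.0 / max(v, 1)  # stand-in for the MCTM simulation
    return w_risk * risk + w_time * travel_time  # safety/mobility trade-off

def genetic_search(lo=30.0, hi=65.0, pop_size=20, generations=50, seed=0):
    """Minimize the combined objective over the speed limit via a simple GA."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                 # lower objective is better
        elite = pop[: pop_size // 2]          # keep the best half
        children = []
        for _ in range(pop_size - len(elite)):
            a, b = rng.sample(elite, 2)
            child = (a + b) / 2 + rng.gauss(0, 1)   # crossover + mutation
            children.append(min(max(child, lo), hi))
        pop = elite + children
    return min(pop, key=fitness)
```

With these surrogates the search settles on a limit well below free-flow speed, mirroring the paper's finding that the fog-condition strategy changes the speed limit cautiously while trading a small mobility loss for a large risk reduction.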
A Causal View of Entity Bias in (Large) Language Models
Entity bias widely affects pretrained (large) language models, causing them
to rely on (biased) parametric knowledge to make unfaithful predictions.
Although causality-inspired methods have shown great potential to mitigate
entity bias, it is hard to precisely estimate the parameters of underlying
causal models in practice. The rise of black-box LLMs also makes the situation
even worse, because of their inaccessible parameters and uncalibrated logits.
To address these problems, we propose a specific structured causal model (SCM)
whose parameters are comparatively easier to estimate. Building upon this SCM,
we propose causal intervention techniques to mitigate entity bias for both
white-box and black-box settings. The proposed causal intervention perturbs the
original entity with neighboring entities. This intervention reduces specific
biasing information pertaining to the original entity while still preserving
sufficient semantic information from similar entities. Under the white-box
setting, our training-time intervention improves OOD performance of PLMs on
relation extraction (RE) and machine reading comprehension (MRC) by 5.7 points
and by 9.1 points, respectively. Under the black-box setting, our in-context
intervention effectively reduces the entity-based knowledge conflicts of
GPT-3.5, achieving up to 20.5 points of improvement of exact match accuracy on
MRC and up to 17.6 points of reduction in memorization ratio on RE. Our code is
available at https://github.com/luka-group/Causal-View-of-Entity-Bias.
Comment: Findings of EMNLP 202
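The neighbor-substitution intervention can be illustrated with a minimal sketch. The `NEIGHBORS` table, entity names, and function signature here are hypothetical placeholders: in practice, neighboring entities would come from similarity in an embedding space rather than a hand-written dictionary.

```python
import random

# Hypothetical neighbor table; real neighbors would be retrieved by
# entity-embedding similarity, not hard-coded.
NEIGHBORS = {
    "Bill Gates": ["Steve Jobs", "Larry Page", "Elon Musk"],
}

def intervene(text, entity, neighbors, k=2, seed=0):
    """Perturb the original entity by replacing it with k sampled
    neighboring entities, yielding inputs that carry less entity-specific
    biasing information while preserving similar semantics."""
    rng = random.Random(seed)
    substitutes = rng.sample(neighbors[entity], k)
    return [text.replace(entity, s) for s in substitutes]
```

Predictions over the perturbed inputs can then be aggregated, so the model's answer depends less on parametric knowledge tied to the original entity.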
Adaptive Test-Time Personalization for Federated Learning
Personalized federated learning algorithms have shown promising results in
adapting models to various distribution shifts. However, most of these methods
require labeled data on testing clients for personalization, which is usually
unavailable in real-world scenarios. In this paper, we introduce a novel
setting called test-time personalized federated learning (TTPFL), where clients
locally adapt a global model in an unsupervised way without relying on any
labeled data at test time. While traditional test-time adaptation (TTA) methods
can be used in this scenario, most inherently assume that training data come
from a single domain, whereas in FL they come from multiple clients (source
domains) with different distributions. Overlooking these domain
interrelationships can
result in suboptimal generalization. Moreover, most TTA algorithms are designed
for a specific kind of distribution shift and lack the flexibility to handle
multiple kinds of distribution shifts in FL. In this paper, we find that this
lack of flexibility partially results from their pre-defining which modules to
adapt in the model. To tackle this challenge, we propose a novel algorithm
called ATP that adaptively learns the adaptation rate for each module in the
model from distribution shifts among source domains. Theoretical analysis
proves the strong generalization of ATP. Extensive experiments demonstrate its
superiority in handling various distribution shifts including label shift,
image corruptions, and domain shift, outperforming existing TTA methods across
multiple datasets and model architectures. Our code is available at
https://github.com/baowenxuan/ATP.
Comment: Accepted by NeurIPS 202
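ATP's central idea, a separate adaptation rate per module, can be illustrated in a simplified form. Everything below (the module names, the single unsupervised gradient step, the hand-set rates) is a hypothetical simplification: in ATP the rates are learned from distribution shifts among the source clients, not chosen by hand.

```python
import numpy as np

def apply_adaptation(params, grads, rates):
    """One test-time update where each module has its own adaptation
    rate: a rate near 0 effectively freezes the module, while a larger
    rate lets it adapt more to the local distribution shift."""
    return {m: params[m] - rates[m] * grads[m] for m in params}

params = {"features": np.ones(3), "classifier": np.ones(2)}
grads  = {"features": np.full(3, 0.5), "classifier": np.full(2, 0.5)}
rates  = {"features": 0.0, "classifier": 1.0}  # e.g. adapt only the head
new_params = apply_adaptation(params, grads, rates)
```

Learning the rates rather than pre-defining which modules to adapt is what gives the method flexibility across label shift, image corruptions, and domain shift.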